-
- CXSUB routines
- --------------------------------------------------------------------------
- As you know, Cx provides a very low level interface to data compression.
- Many application designers, however, may be able to use a higher level
- interface. The CXSUB routines provide a high level, application
- independent interface to Cx data compression. The CXSUB routines have
- been carefully designed to allow easy integration into existing
- applications. You may be able to use the CXSUB routines in your
- applications, but if not, they may be instructive in explaining
- the usage of Cx.
-
-
- The Source Code
- --------------------------------------------------------------------------
- Source code for the CXSUB routines is found in the files:
-
- CXSUB.C - C source code
- CXSUB.H - C header file
- CXSUB.PAS - Turbo Pascal source code
- VBCXSUB.BAS - Visual BASIC source code
-
-
- Programming Interface
- --------------------------------------------------------------------------
-
- CXSUB Error Codes
- ------------------------------------------------------------------
- CXSUB_ERR_OPENS - Could not open source.
- CXSUB_ERR_OPEND - Could not open destination.
- CXSUB_ERR_NOMEM - Insufficient memory.
- CXSUB_ERR_READ - Could not read from source.
- CXSUB_ERR_WRITE - Could not write to destination.
- CXSUB_ERR_CLOSE - Could not close destination.
- CXSUB_ERR_INVALID - Source file is invalid or corrupt.
-
-
- cx_error_message(error)
- ------------------------------------------------------------------
- PURPOSE:
- Return an English error string from a Cx or CXSUB error.
-
- PARAMETER:
- error - error code (CX_ERR* or CXSUB_ERR*)
-
- RETURN:
- An English error message, or "unknown" if the error code is
- unknown.
-
-
- cx_compress_file(dst, src, method, bsize, tsize)
- ------------------------------------------------------------------
- PURPOSE:
- Compress any size or type of file to another file.
-
- PARAMETERS:
- dst - destination file name
- src - source file name
- method - Compression method (CX_METHOD*)
- bsize - compression buffer size (1-CX_MAX_BUFFER)
- tsize - temporary buffer size (CX_C_MINTEMP-CX_C_MAXTEMP)
-
- RETURN:
- CX_ERR_* - Cx error.
- CXSUB_ERR_* - CXSUB error.
- 0 - No error.
-
- NOTES:
- For maximum compression specify bsize and tsize as large as possible.
-
- See section 'CXSUB Single File Compression' for more information.
-
-
- cx_decompress_file(dst, src)
- ------------------------------------------------------------------
- PURPOSE:
- Decompress a file compressed with cx_compress_file.
-
- PARAMETERS:
- dst - destination file name
- src - source file name
-
- RETURN:
- CX_ERR_* - Cx error.
- CXSUB_ERR_* - CXSUB error.
- 0 - No error.
-
- NOTES:
- If dst is not specified (NULL in C, '' in Pascal, "" in Visual
- BASIC), only an integrity check is performed.
-
- See section 'CXSUB Single File Compression' for more information.
-
-
- cx_compress_ofile(ofile, ifile, method, bsize, tsize)
- ------------------------------------------------------------------
- PURPOSE:
- Compress any size or type of file to another file, with files
- previously opened.
-
- PARAMETERS:
- ofile - opened output file
- ifile - opened input file
- method - Compression method (CX_METHOD*)
- bsize - compression buffer size (1-CX_MAX_BUFFER)
- tsize - temporary buffer size (CX_C_MINTEMP-CX_C_MAXTEMP)
-
- RETURN:
- CX_ERR_* - Cx error.
- CXSUB_ERR_* - CXSUB error.
- 0 - No error.
-
- NOTES:
- For maximum compression specify bsize and tsize as large as possible.
-
- See section 'CXSUB Single File Compression' for more information.
-
-
- cx_decompress_ofile(ofile, ifile)
- ------------------------------------------------------------------
- PURPOSE:
- Decompress a file compressed with cx_compress_(o)file, with
- files previously opened.
-
- PARAMETERS:
- ofile - opened output file
- ifile - opened input file
-
- RETURN:
- CX_ERR_* - Cx error.
- CXSUB_ERR_* - CXSUB error.
- 0 - No error.
-
- NOTES:
- See section 'CXSUB Single File Compression' for more information.
-
-
- CXSUB Single File Compression (SFC)
- --------------------------------------------------------------------------
- This section contains general and language specific information about
- the following CXSUB functions:
-
- cx_compress_file - file name interface
- cx_decompress_file - file name interface
-
- cx_compress_ofile - file handle interface
- cx_decompress_ofile - file handle interface
-
-
- Overview
- ---------------------------------------------------------------------
- The CXSUB Single File Compression (SFC) routines provide an easy way to
- compress and decompress one file to another.
-
- There are two interfaces. One is based on file names. Using this
- interface is not much harder than specifying:
-
- "Compress file A to file B" or
- "Decompress file B to file C"
-
- Of course, the decompression routine will only work on files compressed
- with the compression routine. The other interface is based on file
- handles. A file handle is simply a way to reference an open file.
- This interface is provided to allow for future routines based on the
- SFC routines. It is possible, for example, to design an archive file
- format that uses the handle based interface.
-
- All of the provided SFC source code writes and reads the same file
- format.
-
-
- File Format
- ---------------------------------------------------------------------
- The file format is a sequence of variable length 'blocks'. Blocks are
- produced by reading data from a file to be compressed. The amount of
- data read in each pass is known here as the 'original buffer size' or
- BSIZE.
-
- If, for example, you are compressing a 1000 byte file, and BSIZE is
- 100 bytes, 10 blocks will be produced. BSIZE is a parameter to the
- file compression routines (parameter bsize).
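The block count in this example follows from rounding the file size up to a whole number of BSIZE reads. A minimal C sketch of that arithmetic, assuming nothing about the CXSUB sources (the function name sfc_data_blocks is hypothetical and not part of CXSUB):

```c
#include <assert.h>

/* Number of data blocks produced when a file of file_size bytes is read
 * in passes of bsize bytes. The end of file marker is not counted.
 * Hypothetical helper for illustration; not part of CXSUB. */
static unsigned long sfc_data_blocks(unsigned long file_size,
                                     unsigned long bsize)
{
    assert(bsize > 0);
    /* Round up without risking overflow in file_size + bsize - 1. */
    return file_size / bsize + (file_size % bsize != 0);
}
```

For a 1000 byte file with a BSIZE of 100 this gives 10 data blocks. A final partial read still produces its own block.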
-
- A block has 4 pieces of information:
-
- 2 bytes - original buffer size (BSIZE)
- 2 bytes - compressed buffer size (CSIZE)
- 2 bytes - 16 bit CRC (from CX_CRC) (DATACRC)
- CSIZE bytes - (DATA)
-
- The relation between these 4 pieces of information is:
-
- If BSIZE is the same as CSIZE, the original buffer could not be
- compressed. DATA contains uncompressed data.
-
- If BSIZE is not the same as CSIZE, the original buffer was successfully
- compressed. CSIZE will be strictly less than BSIZE. DATA contains
- compressed data.
-
- DATACRC is a 16 bit CRC computed on DATA. Note that this means
- DATACRC is computed on compressed data.
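The rules above can be sketched from the writer's side. The following C fragment is an illustration only, not the CXSUB implementation. It assumes 16 bit fields are stored low byte first (the text does not state a byte order), and it uses a simple byte sum, toy_sum16, as a placeholder because the CX_CRC algorithm is not described here. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder checksum, NOT the CX_CRC algorithm: a 16 bit byte sum. */
static unsigned toy_sum16(const unsigned char *p, size_t n)
{
    unsigned sum = 0;
    while (n--) sum = (sum + *p++) & 0xFFFFu;
    return sum;
}

/* Store a 16 bit value low byte first (byte order is an assumption). */
static void put16(unsigned char *p, unsigned v)
{
    p[0] = (unsigned char)(v & 0xFFu);
    p[1] = (unsigned char)((v >> 8) & 0xFFu);
}

/* Build one block in out[]: BSIZE, CSIZE, DATACRC, DATA.
 * orig/bsize is the original buffer, comp/clen the compressor's output.
 * If compression did not shrink the buffer, the original bytes are
 * stored and CSIZE is set equal to BSIZE.
 * Returns the number of bytes written (6 + CSIZE). */
static size_t sfc_build_block(unsigned char *out,
                              const unsigned char *orig, unsigned bsize,
                              const unsigned char *comp, unsigned clen,
                              unsigned (*crc16)(const unsigned char *, size_t))
{
    const unsigned char *data = comp;
    unsigned csize = clen;

    assert(bsize > 0 && bsize <= 0xFFFFu);
    if (clen >= bsize) {            /* could not be compressed */
        data = orig;
        csize = bsize;
    }
    put16(out, bsize);
    put16(out + 2, csize);
    put16(out + 4, crc16(data, csize));  /* checksum covers DATA as stored */
    memcpy(out + 6, data, csize);
    return 6 + (size_t)csize;
}
```

Passing the checksum routine as a parameter keeps the sketch independent of CX_CRC. A real implementation would pass a wrapper around CX_CRC instead of toy_sum16.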
-
- To indicate the end of a compressed file, an abbreviated block is stored.
- The abbreviated block is simply:
-
- 2 bytes - original buffer size (0)
-
- As an example, consider compressing a 25 byte file with BSIZE equal to
- 10, where:
-
- bytes 0...9 compress to 7 bytes
- bytes 10..19 can't be compressed
- bytes 20..24 compress to 2 bytes
-
- The file data produced from the SFC compression routines will be:
-
- ------------------------------
- block 1
- 2 bytes - 10 (BSIZE)
- 2 bytes - 7 (CSIZE)
- 2 bytes - DATACRC
- 7 bytes - compressed data
- ------------------------------
- block 2
- 2 bytes - 10 (BSIZE)
- 2 bytes - 10 (CSIZE)
- 2 bytes - DATACRC
- 10 bytes - uncompressed data
- ------------------------------
- block 3
- 2 bytes - 5 (BSIZE)
- 2 bytes - 2 (CSIZE)
- 2 bytes - DATACRC
- 2 bytes - compressed data
- ------------------------------
- block 4 (abbreviated end of file block)
- 2 bytes - 0 (BSIZE)
-
- Of course, you would typically use a BSIZE much larger than 10 bytes.
- For maximum compression, you would use a BSIZE of CX_MAX_BUFFER.
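A reader for this format only needs to follow the size fields from block to block until it meets a BSIZE of 0. The C sketch below walks an SFC stream held in memory and counts its data blocks. It is a hedged illustration: the low byte first field order is an assumption, the function names are hypothetical, and DATACRC is skipped rather than verified because CX_CRC is not reproduced here.

```c
#include <stddef.h>

/* Read a 16 bit value stored low byte first (byte order is an assumption). */
static unsigned get16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/* Walk an SFC stream in buf[0..len-1].
 * Returns the number of data blocks, or -1 if the stream is truncated
 * or a header is inconsistent (CSIZE greater than BSIZE).
 * *orig_total receives the sum of the BSIZE fields, which is the size
 * of the original file. */
static long sfc_walk(const unsigned char *buf, size_t len,
                     unsigned long *orig_total)
{
    size_t pos = 0;
    long blocks = 0;

    *orig_total = 0;
    for (;;) {
        unsigned bsize, csize;

        if (len - pos < 2) return -1;      /* no room for BSIZE */
        bsize = get16(buf + pos);
        if (bsize == 0) return blocks;     /* abbreviated end block */

        if (len - pos < 6) return -1;      /* no room for full header */
        csize = get16(buf + pos + 2);
        if (csize > bsize) return -1;      /* CSIZE never exceeds BSIZE */
        if (len - pos - 6 < csize) return -1;  /* DATA is cut short */

        /* DATACRC at buf + pos + 4 is not checked in this sketch. */
        *orig_total += bsize;
        pos += 6 + (size_t)csize;
        blocks++;
    }
}
```

Applied to the 25 byte example above, the walk visits three data blocks of 13, 16 and 8 bytes, then stops at the 2 byte end marker. The compressed file is 39 bytes long in total.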
-
-
- Motivation / Questions / Expanding or Improving the SFC routines
- ---------------------------------------------------------------------
- The following questions and answers may provide insight into the
- SFC functions.
-
- Q: Why are bsize and tsize parameters? For maximum compression,
- bsize should always be CX_MAX_BUFFER and tsize should always be
- CX_C_MAXTEMP.
- A: Some applications want or need to minimize memory usage. By
- keeping bsize and tsize parameters, the application can balance
- memory usage and compression size.
-
-
- Q: Why are both BSIZE and CSIZE stored?
- A: By storing both, it is possible to handle incompressible data.
- If BSIZE is equal to CSIZE, the stored buffer is known to be
- uncompressed.
-
-
- Q: Why is a CRC computed on the compressed buffer as opposed to the
- original buffer?
- A: Testing has determined that a CRC on compressed buffers is better
- able to detect errors than a CRC on original buffers. In addition,
- as compressed buffers are typically smaller than original buffers,
- a CRC on a compressed buffer is quicker to compute.
-
-
- Q: Why is the last block abbreviated?
- A: Simply to save space. By abbreviating the final block, it is
- possible to save 4 bytes of storage for each compressed file.
- Note, however, that this is a fairly arbitrary decision. As
- file I/O calls consume time, it may be desirable to store a
- 'complete' block. This would eliminate up to 2 file I/O calls
- per block when decompressing.
-
-
- Q: Why isn't the compression method stored?
- A: CX_DECOMPRESS can decompress any buffer compressed with CX_COMPRESS
- without knowing beforehand the specific compression method used.
-
-
- Q: What if I wanted to store the original file's time stamp and/or
- name in a compressed file?
- A: This would be a fairly easy addition. You could add a header
- to the SFC file format. As an example:
-
- 4 bytes - file's time stamp
- 1 byte - name length (NAMELEN)
- NAMELEN bytes - file name
-
- The cx_compress_file and cx_decompress_file functions could be
- modified to write, read and use this header information. The only
- additional routines you would have to call (included in most
- languages) are for reading and writing a file's time stamp.
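The example header could be serialized along the following lines. This C sketch is one possible reading, not part of CXSUB. It assumes the 4 byte time stamp is stored low byte first and treats the time stamp as an opaque 32 bit value. The function name sfc_put_file_header is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Write the example header (time stamp, NAMELEN, name) into out[].
 * out must have room for 5 + strlen(name) bytes.
 * Returns the number of bytes written, or 0 if the name does not fit
 * in the 1 byte NAMELEN field.
 * Hypothetical helper for illustration; not part of CXSUB. */
static size_t sfc_put_file_header(unsigned char *out,
                                  unsigned long stamp,
                                  const char *name)
{
    size_t n = strlen(name);
    int i;

    if (n > 255) return 0;

    /* 4 bytes, low byte first (byte order is an assumption). */
    for (i = 0; i < 4; i++)
        out[i] = (unsigned char)((stamp >> (8 * i)) & 0xFFu);

    out[4] = (unsigned char)n;      /* NAMELEN */
    memcpy(out + 5, name, n);       /* name, without a terminating 0 */
    return 5 + n;
}
```

The matching reader would consume these bytes before the first block. The 1 byte length field limits names to 255 characters.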
-
-
- Q: What if I wanted to extract valid data from a corrupt compressed
- file?
- A: This could be accomplished by expanding a block. Instead of:
-
- 2 bytes - original buffer size (BSIZE)
- 2 bytes - compressed buffer size (CSIZE)
- 2 bytes - 16 bit CRC (from CX_CRC) (DATACRC)
- CSIZE bytes - (DATA)
-
- You could specify:
-
- 4 bytes - header like '$CX$'
- 4 bytes - physical file location (POS)
- 2 bytes - original buffer size (BSIZE)
- 2 bytes - compressed buffer size (CSIZE)
- 2 bytes - 16 bit CRC (from CX_CRC) (DATACRC)
- CSIZE bytes - (DATA)
-
- With a corrupt file, you could search the file for the block header
- ($CX$). After finding a header, you would have all the information
- you need to extract a valid original buffer. If there are errors
- when decompressing a block, you would know it is invalid. Note
- that smaller BSIZE values offer more potential for recovery, as each
- block will affect less data.
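The search described above might look like the following C sketch, which scans a file image in memory for the '$CX$' signature and applies a few cheap plausibility checks before decompression is attempted. It rests on stated assumptions: multi-byte fields are stored low byte first, and POS holds the offset of the block's own signature. All names are hypothetical and none of this is part of CXSUB.

```c
#include <stddef.h>
#include <string.h>

#define SFC_NOT_FOUND ((size_t)-1)

/* Read an nbytes wide value stored low byte first (an assumption). */
static unsigned long get_le(const unsigned char *p, int nbytes)
{
    unsigned long v = 0;
    while (nbytes-- > 0) v = (v << 8) | p[nbytes];
    return v;
}

/* Return the offset of the next "$CX$" at or after start,
 * or SFC_NOT_FOUND if there is none. */
static size_t sfc_find_signature(const unsigned char *buf, size_t len,
                                 size_t start)
{
    size_t i;

    if (len < 4) return SFC_NOT_FOUND;
    for (i = start; i <= len - 4; i++)
        if (memcmp(buf + i, "$CX$", 4) == 0) return i;
    return SFC_NOT_FOUND;
}

/* Check the expanded header at offset off:
 * signature(4) POS(4) BSIZE(2) CSIZE(2) DATACRC(2) DATA(CSIZE).
 * Returns 1 if the fields are self-consistent, 0 otherwise.
 * DATACRC is not verified here; CX_CRC is not reproduced. */
static int sfc_block_plausible(const unsigned char *buf, size_t len,
                               size_t off)
{
    unsigned long pos, bsize, csize;

    if (off > len || len - off < 14) return 0;
    pos   = get_le(buf + off + 4, 4);
    bsize = get_le(buf + off + 8, 2);
    csize = get_le(buf + off + 10, 2);

    if (pos != off) return 0;                  /* assumed meaning of POS */
    if (bsize == 0 || csize > bsize) return 0;
    return len - off - 14 >= csize;            /* DATA fits in the file */
}
```

A recovery tool would call sfc_find_signature repeatedly, resuming one byte past each hit, and hand every plausible block to the decompressor. The signature can also occur by chance inside compressed data, so the CRC and the decompression result remain the final test of validity.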
-
-